深度神经网络(DNNS)的边缘训练是持续学习的理想目标。但是,这受到训练所需的巨大计算能力的阻碍。硬件近似乘数表明,它们在获得DNN推理加速器中获得资源效率的有效性;但是,使用近似乘数的培训在很大程度上尚未开发。为了通过支持DNN培训的近似乘数来构建有效的资源加速器,需要对不同DNN体系结构和不同近似乘数进行彻底评估。本文介绍了近似值,这是一个开源框架,允许使用模拟近似乘数快速评估DNN训练和推理。近似值与TensorFlow(TF)一样用户友好,仅需要对DNN体系结构的高级描述以及近似乘数的C/C ++功能模型。我们通过使用GPU(AMSIM)上的基于基于LUT的近似浮点(FP)乘数模拟器来提高乘数在乘数级别的模拟速度。近似值利用CUDA并有效地将AMSIM集成到张量库中,以克服商业GPU中的本机硬件近似乘数的缺乏。我们使用近似值来评估使用LENET和RESNETS体系结构的小型和大型数据集(包括Imagenet)的近似乘数的DNN训练的收敛性和准确性。与FP32和BFLOAT16乘数相比,评估表明测试准确性相似的收敛行为和可忽略不计的变化。与训练和推理中基于CPU的近似乘数模拟相比,GPU加速近似值快2500倍以上。基于具有本地硬件乘数的高度优化的闭合源Cudnn/Cublas库,原始张量量仅比近似值快8倍。
translated by 谷歌翻译
基于新兴的非易失性记忆(NVM)设备基于内存的计算(CIM)体系结构,由于其高能量效率,具有深度神经网络(DNN)加速的巨大潜力。但是,NVM设备遭受了各种非理想性,尤其是由于设备的随机行为而导致的制造缺陷和周期到周期变化引起的设备对设备变化。因此,实际上映射到NVM设备的DNN权重可能显着偏离预期值,从而导致大量性能降解。为了解决这个问题,大多数现有的作品都集中在设备变化下的平均性能最大化。这个目标对于通用场景非常有效。但是对于关键安全应用,还必须考虑最差的案例性能。不幸的是,文献中很少探索这一点。在这项工作中,我们制定了确定在设备变化影响下CIM DNN加速器最差的问题的问题。我们进一步提出了一种方法,可以有效地找到高维空间中设备变化的特定组合,从而导致最差的性能。我们发现,即使设备变化很小,DNN的准确性也会大幅度下降,在部署CIM加速器中在安全至关重要的应用中引起担忧。最后,我们表明,令人惊讶的是,在扩展时,没有一种用于提高CIM加速器中平均DNN性能的现有方法非常有效,以增强最差的性能,并且需要进一步的研究来解决此问题。
translated by 谷歌翻译
近年来,自动对色素,非色素和脱发的非胸膜皮肤病变的分类引起了很多关注。但是,皮肤纹理,病变形状,脱位对比度,照明条件等的成像变化。阻碍了鲁棒的特征提取,从而影响分类精度。在本文中,我们提出了一个新的深神经网络,该网络利用输入数据进行鲁棒特征提取。具体而言,我们分析了卷积网络的行为(视野),以找到深度监督的位置,以改善特征提取。为了实现这一目标,首先,我们执行激活映射以生成对象掩码,突出显示对分类输出生成最重要的输入区域。然后,选择层的有效接收场的网络层与对象掩模中的近似对象形状相匹配,以作为我们进行深度监督的焦点。利用三个黑色素瘤检测数据集和两个白癜风检测数据集上的不同类型的卷积特征提取器和分类器,我们验证了新方法的有效性。
translated by 谷歌翻译
经验重播是深入增强学习(DRL)的重要组成部分,它可以存储经验并为代理商实时学习的经验。最近,优先的经验重播(PER)已被证明是强大的,并且在DRL代理中已广泛部署。但是,由于其频繁和不规则的内存访问,在传统的CPU或GPU架构上实施会造成大量的延迟开销。本文提出了一种硬件软件共同设计方法,以设计基于AMPER的相关内存(AM),并具有AM友好的优先采样操作。 Amper在保留学习绩效的同时,以PER中的Per取代了广泛使用的时间成本的基于Tree-Traversal的优先级抽样。此外,我们设计了基于AM的内存计算硬件体系结构,以通过利用并行的内存搜索操作来支持安珀。与GPU上的每次运行相比,Amper在在拟议的硬件上运行时,在拟议的硬件上运行55倍至270倍的延迟延迟时,显示出可比的学习表现。
translated by 谷歌翻译
对象视觉导航旨在基于代理的视觉观察来转向目标对象。非常希望合理地感知环境并准确控制代理。在导航任务中,我们引入了一个以代理为中心的关系图(ACRG),用于基于环境中的关系学习视觉表示。 ACRG是一种高效且合理的结构,包括两个关系,即物体之间的关系以及代理与目标之间的关系。一方面,我们设计了存储物体之间的相对水平位置的对象水平关系图(OHRG)。请注意,垂直关系不涉及OHRG,我们认为OHRG适合控制策略。另一方面,我们提出了代理 - 目标深度关系图(ATDRG),使代理能够将距离视为目标的距离。为了实现ATDRG,我们利用图像深度来表示距离。鉴于上述关系,代理可以察觉到环境和输出导航操作。鉴于ACRG和位置编码的全局功能构造的可视表示,代理可以捕获目标位置以执行导航操作。人工环境中的实验结果AI2-Thor表明ACRG显着优于看不见的检测环境中的其他最先进的方法。
translated by 谷歌翻译
磁共振光谱(MRS)是揭示代谢信息的无创工具。 1H-MRS的一个挑战是低信号噪声比(SNR)。为了改善SNR,一种典型的方法是用M重复样品进行信号平均(SA)。但是,数据采集时间相应地增加了M次,并且在公共环境M = 128时,完整的临床MRS SCAN大约需要10分钟。最近,引入了深度学习以改善SNR,但大多数人将模拟数据用作培训集。这可能会阻碍MRS应用程序,因为某些潜在差异(例如获取系统的缺陷)以及模拟和体内数据之间可能存在生理和心理条件。在这里,我们提出了一种新方案,该方案纯粹使用了现实数据的重复样本。深度学习模型,拒绝长期记忆(RELSTM),旨在学习从低SNR时间域数据(24 SA)到高SNR ONE(128 SA)的映射。对7个健康受试者,2名脑肿瘤患者和1名脑梗塞患者的体内脑光谱进行实验表明,仅使用20%的重复样品,RelstM的DeNoed Spectra可以为128 SA提供可比的代谢物。与最先进的低级别去核法相比,RELSTM在量化某些重要的生物标志物时达到了较低的相对误差和cram \'er-rao下限。总而言之,RELSTM可以在快速获取(24 SA)下对光谱进行高保真降级,这对MRS临床研究很有价值。
translated by 谷歌翻译
Classical methods for acoustic scene mapping require the estimation of time difference of arrival (TDOA) between microphones. Unfortunately, TDOA estimation is very sensitive to reverberation and additive noise. We introduce an unsupervised data-driven approach that exploits the natural structure of the data. Our method builds upon local conformal autoencoders (LOCA) - an offline deep learning scheme for learning standardized data coordinates from measurements. Our experimental setup includes a microphone array that measures the transmitted sound source at multiple locations across the acoustic enclosure. We demonstrate that LOCA learns a representation that is isometric to the spatial locations of the microphones. The performance of our method is evaluated using a series of realistic simulations and compared with other dimensionality-reduction schemes. We further assess the influence of reverberation on the results of LOCA and show that it demonstrates considerable robustness.
translated by 谷歌翻译
Implicit regularization is an important way to interpret neural networks. Recent theory starts to explain implicit regularization with the model of deep matrix factorization (DMF) and analyze the trajectory of discrete gradient dynamics in the optimization process. These discrete gradient dynamics are relatively small but not infinitesimal, thus fitting well with the practical implementation of neural networks. Currently, discrete gradient dynamics analysis has been successfully applied to shallow networks but encounters the difficulty of complex computation for deep networks. In this work, we introduce another discrete gradient dynamics approach to explain implicit regularization, i.e. landscape analysis. It mainly focuses on gradient regions, such as saddle points and local minima. We theoretically establish the connection between saddle point escaping (SPE) stages and the matrix rank in DMF. We prove that, for a rank-R matrix reconstruction, DMF will converge to a second-order critical point after R stages of SPE. This conclusion is further experimentally verified on a low-rank matrix reconstruction problem. This work provides a new theory to analyze implicit regularization in deep learning.
translated by 谷歌翻译
Users' physical safety is an increasing concern as the market for intelligent systems continues to grow, where unconstrained systems may recommend users dangerous actions that can lead to serious injury. Covertly unsafe text, language that contains actionable physical harm, but requires further reasoning to identify such harm, is an area of particular interest, as such texts may arise from everyday scenarios and are challenging to detect as harmful. Qualifying the knowledge required to reason about the safety of various texts and providing human-interpretable rationales can shed light on the risk of systems to specific user groups, helping both stakeholders manage the risks of their systems and policymakers to provide concrete safeguards for consumer safety. We propose FARM, a novel framework that leverages external knowledge for trustworthy rationale generation in the context of safety. In particular, FARM foveates on missing knowledge in specific scenarios, retrieves this knowledge with attribution to trustworthy sources, and uses this to both classify the safety of the original text and generate human-interpretable rationales, combining critically important qualities for sensitive domains such as user safety. Furthermore, FARM obtains state-of-the-art results on the SafeText dataset, improving safety classification accuracy by 5.29 points.
translated by 谷歌翻译
Active galactic nuclei (AGN) are supermassive black holes with luminous accretion disks found in some galaxies, and are thought to play an important role in galaxy evolution. However, traditional optical spectroscopy for identifying AGN requires time-intensive observations. We train a convolutional neural network (CNN) to distinguish AGN host galaxies from non-active galaxies using a sample of 210,000 Sloan Digital Sky Survey galaxies. We evaluate the CNN on 33,000 galaxies that are spectrally classified as composites, and find correlations between galaxy appearances and their CNN classifications, which hint at evolutionary processes that affect both galaxy morphology and AGN activity. With the advent of the Vera C. Rubin Observatory, Nancy Grace Roman Space Telescope, and other wide-field imaging telescopes, deep learning methods will be instrumental for quickly and reliably shortlisting AGN samples for future analyses.
translated by 谷歌翻译